Intelligent character recognition

In computer science, intelligent character recognition (ICR) is an advanced form of optical character recognition (OCR) or, more specifically, a handwriting recognition system that allows a computer to learn fonts and different styles of handwriting during processing, improving accuracy and recognition levels.

Most ICR software has a self-learning system, referred to as a neural network, which automatically updates the recognition database with new handwriting patterns. It extends the usefulness of scanning devices for document processing from printed character recognition (a function of OCR) to recognition of handwritten matter. Because handwriting varies so widely, accuracy may be poor in some circumstances, but ICR can achieve accuracy rates of 97% or higher when reading handwriting in structured forms. To achieve these high recognition rates, the software often uses several read engines, each of which is given voting rights to determine the true reading of each character. In numeric fields, engines designed to read numbers take precedence, while in alpha fields, engines designed to read handwritten letters carry greater voting weight. When used in conjunction with a bespoke interface hub, handwritten data can be populated automatically into a back-office system, avoiding laborious manual keying, and can be more accurate than traditional human data entry.
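The voting scheme described above can be sketched as a weighted vote. This is a minimal illustration only: the engine names, the weights, and the confidence values are hypothetical and are not taken from any particular ICR product.

```python
from collections import defaultdict

# Hypothetical vote weights per field type (assumed values for illustration).
ENGINE_WEIGHTS = {
    "numeric": {"digit_engine": 3.0, "alpha_engine": 1.0, "general_engine": 1.5},
    "alpha": {"digit_engine": 1.0, "alpha_engine": 3.0, "general_engine": 1.5},
}


def vote(field_type, readings):
    """Return the character with the highest weighted vote.

    readings maps an engine name to the (character, confidence) it proposes.
    """
    weights = ENGINE_WEIGHTS[field_type]
    scores = defaultdict(float)
    for engine, (char, confidence) in readings.items():
        # An engine's vote counts for more in the field type it was built for.
        scores[char] += weights.get(engine, 1.0) * confidence
    return max(scores, key=scores.get)


# Three engines disagree on whether a mark is the digit 0 or the letter O.
readings = {
    "digit_engine": ("0", 0.80),
    "alpha_engine": ("O", 0.90),
    "general_engine": ("O", 0.60),
}
numeric_result = vote("numeric", readings)  # digit engine outweighs the others
alpha_result = vote("alpha", readings)      # alpha engine outweighs the others
```

With these assumed weights, the same three readings resolve to the digit in a numeric field and to the letter in an alpha field, which is the effect the field-dependent voting is meant to produce.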

An important development in ICR was the invention of Automated Forms Processing in 1993. This involved a three-stage process: capturing the image of the form and preparing it so that the ICR engine gives the best results, capturing the information using the ICR engine, and finally processing the results to validate the engine's output automatically.
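The three stages can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not the 1993 system: the recognition stage is a placeholder that returns fixed readings, and the field names and validation patterns are hypothetical.

```python
import re


def prepare_image(gray_pixels, threshold=128):
    """Stage 1: binarize a grayscale image (rows of 0-255 values); ink becomes 1."""
    return [[1 if p < threshold else 0 for p in row] for row in gray_pixels]


def recognize_fields(binary_image):
    """Stage 2: placeholder for an ICR engine; returns fixed readings for illustration."""
    return {"postcode": "9O210", "amount": "125.50"}


# Stage 3 rules: hypothetical per-field patterns used to check the engine output.
FIELD_RULES = {
    "postcode": re.compile(r"\d{5}"),
    "amount": re.compile(r"\d+\.\d{2}"),
}


def validate(fields):
    """Stage 3: split readings into accepted values and values flagged for review."""
    accepted, flagged = {}, {}
    for name, value in fields.items():
        rule = FIELD_RULES.get(name)
        if rule is not None and rule.fullmatch(value):
            accepted[name] = value
        else:
            flagged[name] = value
    return accepted, flagged


def process_form(gray_pixels):
    """Run the three stages in order on one scanned form."""
    return validate(recognize_fields(prepare_image(gray_pixels)))


accepted, flagged = process_form([[250, 30], [40, 245]])
```

In this sketch the postcode reading contains the letter O instead of a digit, so validation flags it for manual review while the amount is accepted.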

This application of ICR increased the usefulness of the technology and made it applicable to real-world forms in normal business applications. Modern software applications use ICR to recognize text in forms filled in by hand (hand-printed):

| Company | Products | ICR Languages Supported |
|---|---|---|
| Parascript | Parascript CheckPlus, Parascript AddressScript, Parascript FormXtra, Parascript FieldScript | English, French, German, Italian, Kazakh, Portuguese, Russian and Spanish |
| A2iA | A2iA DocumentReader, A2iA CheckReader, A2iA AddressReader, A2iA FieldReader | English, French, German, Italian, Portuguese, Spanish and Arabic |
| ABBYY | ABBYY FlexiCapture, ABBYY FlexiCapture Engine, ABBYY FineReader Engine | Afrikaans, Albanian, Aymara, Azerbaijani (Latin), Basque, Bemba, Blackfoot, Breton, Bugotu, Bulgarian, Cebuano, Chamorro, Corsican, Crimean Tatar, Croatian, Crow, Czech, Dakota (Sioux), Dutch (Belgium), Dutch (Netherlands), English, Estonian, Even, Evenki, Fijian, Finnish, French, Frisian, Friulian, Galician, Ganda, German, German (Luxembourg), German (new spelling), Greek, Guarani, Hani, Hausa, Hawaiian, Hungarian, Icelandic, Indonesian, Irish, Italian, Jingpo, Karachay-balkar, Kasub, Kawa, Kazakh, Kirghiz, Kongo, Kpelle, Kumyk, Kurdish, Latin, Latvian, Lithuanian, Luba, Malagasy, Malinke, Maori, Maya, Miao, Minangkabau, Mohawk, Moldavian, Mongol, Mordvin, Nahuatl, Nivkh, Nogay, Nyanja, Ojibway, OldFrench, OldGerman, OldItalian, OldSpanish, Papiamento, Polish, Quechua, Rhaeto-Romanic, Romanian, Romany, Rundi, Russian, Rwanda, Sami (Lappish), Samoan, Scottish Gaelic, Selkup, Serbian (Latin), Slovak, Slovenian, Somali, Sotho, Spanish, Swahili, Swazi, Tagalog, Tahitian, Tok Pisin, Tongan, Tswana, Tun, Turkish, Uigur (Latin), Ukrainian, Wolof, Xhosa, Zapotec, Ido, Interlingua |
| Accusoft Pegasus | SmartZone ICR/OCR | English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, and Swedish (.NET supports all listed, ActiveX is English only) |
| Cognitive Technologies | Cognitive Forms | Russian, ? |
| ExperVision | TypeReader, OpenRTK | English, French, German, Italian, Spanish, Portuguese, Danish, Dutch, Swedish, Norwegian, Hungarian, Polish, Simplified Chinese, Traditional Chinese, Russian, Finnish and Polynesian |
| I.R.I.S. Group | IRISCapture Pro for Forms | Latin-based languages |
| LEADTOOLS | LEADTOOLS ICR SDK Module | Catalan, Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Polish, Portuguese, Spanish, Swedish |

Taking ICR to the Next Level

Intelligent word recognition (IWR) can recognize and extract not only hand-printed information but also cursive handwriting. ICR recognizes text at the character level, whereas IWR works with full words or phrases. Capable of capturing unstructured information from everyday pages, IWR is said to be more evolved than hand-print ICR, according to the CCA (Committee for Capturing Abstractions).
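The difference between character-level and word-level reading can be illustrated with a toy example. This is a deliberate simplification: it scores whole words from a small lexicon against per-position character scores, whereas real IWR systems may rely on quite different features. The scores and the lexicon are hypothetical.

```python
# Hypothetical per-position character scores for a five-letter handwritten word.
position_scores = [
    {"c": 0.9, "e": 0.1},
    {"1": 0.5, "l": 0.4, "i": 0.1},
    {"e": 0.8, "c": 0.2},
    {"a": 0.7, "o": 0.3},
    {"r": 0.6, "n": 0.4},
]

# Hypothetical lexicon of words expected in the field.
LEXICON = ["clear", "clean", "close", "cream"]


def read_by_character(scores):
    """Character-level reading: pick the best candidate at each position independently."""
    return "".join(max(s, key=s.get) for s in scores)


def read_by_word(scores, lexicon):
    """Word-level reading: pick the lexicon word whose letters score best overall."""
    def word_score(word):
        if len(word) != len(scores):
            return 0.0
        total = 1.0
        for ch, s in zip(word, scores):
            total *= s.get(ch, 0.0)  # a letter never proposed scores zero
        return total

    return max(lexicon, key=word_score)


char_level = read_by_character(position_scores)
word_level = read_by_word(position_scores, LEXICON)
```

Here the character-level reading keeps the digit 1 in the second position because it is the single best candidate there, while the word-level reading settles on a valid word from the lexicon.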

Not meant to replace conventional ICR and OCR systems, IWR is optimized for processing real-world documents that contain mostly free-form, hard-to-recognize data fields that are inherently unsuitable for ICR. The best use of IWR is therefore to eliminate a high percentage of the manual entry of handwritten data and run-on hand-print fields on documents that otherwise could be keyed only by humans.

See also